Abstract: We will briefly outline a computational theory of the first stages of human vision according
to which (a) the retinal image is filtered by a set of centre-surround receptive fields (of about 5
different spatial sizes) which are approximately bandpass in spatial frequency and (b) zero-crossings
are detected independently in the output of each of these channels. Zero-crossings in each channel
are then a set of discrete symbols which may be used for later processing such as contour extraction
and stereopsis. A formulation of Logan's zero-crossing results is proved for the case of Fourier
polynomials and an extension of Logan's theorem to 2-dimensional functions is also proved. Within this
framework, we shall describe an experimental and theoretical approach (developed by one of us with
M. Fahle) to the problem of visual acuity and hyperacuity of human vision. The positional accuracy
achieved, for instance, in reading a vernier is astonishingly high, corresponding to a fraction of the
spacing between adjacent photoreceptors in the fovea. Stroboscopic presentation of a moving object
can be interpolated by our visual system into the perception of continuous motion; and this
"spatiotemporal" interpolation also can be very accurate. It is suggested that the known spatiotemporal
properties of the channels envisaged by the theory of visual processing outlined above implement an
interpolation scheme which can explain human vernier acuity for moving targets.

We consider, in particular, the problem of avoiding aliasing in the perifoveal visual field. It is
conjectured that gap junctions (or another form of coupling) between rods and cones are needed to
avoid aliasing outside the fovea. Possible implications for machine vision and imaging devices are
briefly discussed.

In the last seven years a new computational approach has led to promising advances in the
understanding of visual perception. This approach, which may be relevant not only for the information
sciences but also for the neurosciences, is mainly due to the late D. Marr and his colleagues. In this
article we will briefly describe this computational theory for the very first stages of vision, since it
provides a useful framework for approaching the problem of spatiotemporal acuity in human vision,
which is the main topic of the paper.*


1.1  A  Computational  Approach 

The central tenet of this approach is that vision is primarily a complex information processing task,
with the goal of capturing and representing the various aspects of the world that are of use to us. It is a
feature of such tasks, arising from the fact that the information processed in a machine is only loosely
constrained by the physical properties of the machine, that they must be understood at different,
though interrelated, levels. This framework, formulated by Marr & Poggio (1976), was not new: H.
Simon and especially L. Harmon emphasized a similar point of view in a more general context.

In a process like vision it is useful to distinguish three levels over which one's descriptions and
explanations of the process must range: a) computational theory, b) algorithm, c) implementation.
These are not hard and fast divisions. The important point is that no explanation or set of
explanations is complete unless it covers this range. To avoid possible misunderstandings, we wish to stress
that this computational approach is not a substitute for the "traditional" methods and techniques
of the neurosciences, to which it is in fact complementary. It is probably fair to say that most
physiologists and students of psychophysics have often approached a specific problem in visual
perception with their personal "computational" prejudices about the goal of the system and why it does
what it does. With few exceptions this heuristic attitude, although useful, remained at the level of
prejudices; computational analysis was not a science, nor was it appreciated in the neurosciences that
one was needed.

*Some of the material for this paper has been drawn from Poggio (1981) and Fahle and Poggio (1981).

This state of affairs is hardly surprising. The difficulties of the vision process are often not
appreciated even now. Until the early 70's the fields of computer science and artificial intelligence failed
to realise that problems in vision are difficult. The reason, of course, is that we are extremely good at
it, but in a way which cannot be subjected to careful introspection. Today we know that the problems
are profound. "Ad hoc" methods and tricks have consistently failed. Marr realized what the message
was. A science of visual information processing was needed to analyze a given information processing
task and its basis in the physical world. Marr's work, from the breadth of the approach to its rigorous
detail in the analysis of specific problems, provides a methodological lesson for this new field.


1.2  The  Detection  of  Intensity  Changes 

In this section we will outline one of the very first stages in the processing of visual information, the
computation of zero-crossings. The basic ideas, outlined by Marr in a paper (1976), have evolved
into a scheme (Marr & Poggio, 1977) based on bandpass filtering of the image through difference
of gaussians and detection of the associated zero-crossings. Marr and Hildreth (1980) have provided
a number of attractive arguments for justifying this scheme from a computational point of view,
although a complete formal theory is still lacking. We will outline here their main points.

The goal of the first step of vision is to detect changes in the reflectance of the physical surfaces
around the viewer or in the surface orientation and distance. On various computational grounds,
sharp changes in the image intensity turn out to be the best indicator of most physical changes in the
surface. In natural images, intensity changes can and do occur over a wide range of spatial scales. It
follows that their optimal detection requires the use of operators (that is, filters) of different sizes. A
sudden intensity change like an edge gives rise to a maximum or a minimum in the first derivative
of image intensities or equivalently to a zero-crossing in the second derivative. Marr and Hildreth
(1980) argue that the desired filter should take the second derivative of the image at a particular scale.
A convenient choice for the derivative in two dimensions is the Laplacian $\nabla^2$, and
the appropriate scale can be set by filtering the image with a 2-D Gaussian filter $G$, which optimally
satisfies specific constraints on the real world, particularly the fact that intensity changes arising from
physical objects are spatially localized at their own scale. Since the operations of taking the derivative
and blurring an image are linear, the overall transformation is equivalent to convolving the image
with the Laplacian of a gaussian distribution, that is with $\nabla^2 G$. As shown by fig. 1, this corresponds to
a centre-surround type of receptive field. Such a filter closely resembles the usual descriptions of the
ganglion cell receptive field and of the psychophysical channels in human vision as the difference of
two gaussians, an excitatory and an inhibitory one. Spatial filters with the centre-surround
organization shown in fig. 1 are of course bandpass in spatial frequency, although their bandwidth is not very
narrow.

Figure 1. A cross-section of the circularly symmetric centre-surround receptive field $\nabla^2 G$.

In  summary,  the  process  of  finding  intensity  changes  at  a  given  scale  consists  of  filtering  the  image 
with  a  centre-surround  type  of  receptive  field,  with  a  size  reflecting  the  scale  at  which  the  changes 
have  to  be  detected,  and  then  locating  the  zero-crossings  in  the  filtered  image  (see  fig.2). 
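
The filter-then-locate procedure summarized above can be sketched in one dimension. This is a minimal illustration under stated assumptions, not the authors' implementation: the second derivative of a sampled Gaussian stands in for $\nabla^2 G$, the step-edge test signal is hypothetical, and the function names (`log_kernel`, `convolve_valid`, `zero_crossings`) are introduced here for illustration only.

```python
import math

def log_kernel(sigma, radius=None):
    """Sampled second derivative of a 1-D Gaussian (1-D analogue of the
    Laplacian of a Gaussian). The radius defaults to 4 * sigma."""
    if radius is None:
        radius = int(math.ceil(4 * sigma))
    kernel = []
    for i in range(-radius, radius + 1):
        g = math.exp(-i * i / (2.0 * sigma ** 2))
        kernel.append((i * i / sigma ** 4 - 1.0 / sigma ** 2) * g)
    # Remove the residual mean so that a constant intensity gives zero output.
    mean = sum(kernel) / len(kernel)
    return [k - mean for k in kernel]

def convolve_valid(signal, kernel):
    """'Valid' convolution: output[j] is centred on signal[j + radius]."""
    m = len(kernel)
    flipped = kernel[::-1]
    return [sum(signal[j + k] * flipped[k] for k in range(m))
            for j in range(len(signal) - m + 1)]

def zero_crossings(values, offset=0, eps=1e-9):
    """Positions of sign changes, refined by linear interpolation between
    the two samples that straddle each crossing."""
    positions = []
    for j in range(len(values) - 1):
        a, b = values[j], values[j + 1]
        if (a > eps and b < -eps) or (a < -eps and b > eps):
            positions.append(offset + j + a / (a - b))
    return positions

# Hypothetical test image row: a step edge midway between samples 49 and 50.
signal = [0.0] * 50 + [1.0] * 50
for sigma in (1.0, 2.0, 4.0):          # three channels of different size
    kernel = log_kernel(sigma)
    radius = len(kernel) // 2
    filtered = convolve_valid(signal, kernel)
    print(sigma, zero_crossings(filtered, offset=radius))
```

For an isolated step edge each channel is expected to report a single crossing at the edge position; in natural images the channels differ, which is why several filter sizes are used.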


To detect changes at all scales, it is necessary only to add other channels, of different dimension,
and carry out the same computation for each channel independently.

Figure 2. The image (a) has been convolved with a centre-surround receptive field with the shape illustrated in Fig. 1.
(b) shows the convolved image: positive values are shown white and negative black; white (black) values would then
represent the activity of the corresponding on-(off-) centre ganglion cells "looking" at the image. (c) The zero-crossings
profile contains rich information about the filtered image (b), as explained in the text. Similar independent filters of
smaller and larger sizes are needed to capture the whole information contained in (a). From Marr and Hildreth (1980).

Zero-crossings in each channel thus form a set of discrete symbols which are used for later
processing such as stereopsis (Marr & Poggio, 1977). Marr and Hildreth, in particular, addressed the problem
of how to combine zero-crossings from different channels into primitive edge elements, taking
advantage of physical constraints obeyed by the visual world. These and other symbolic descriptors
then represent what Marr called the "raw primal sketch". Instead of describing these parts of the
theory, we shall discuss in more detail the zero-crossing detection process and the corresponding
physiological and psychophysical evidence. Zero-crossings in the output of centre-surround channels
represent a natural way of obtaining a discrete, symbolic representation of the image from the original
"continuous" intensity values. Some recent deep results in complex analysis by B. Logan (1977) seem
to support this scheme in a way which we have found intriguing and fascinating ever since we came across
his remarkable paper. His main theorem (see Appendix 1a) states that a bandpass one-dimensional
signal with a bandwidth of less than 1 octave can be reconstructed completely, up to a constant
multiplication factor, from its zero-crossings alone (if some relatively weak conditions are satisfied). From
the point of view of visual information processing there is clearly no need to reconstruct the original
signal. But the theorem suggests that the "discrete" symbols provided by zero-crossings are very rich
in information about the original image. Unfortunately, more definite claims are as yet impossible,
since an extension of the theorem to images (Appendix 1a and especially 1b; see also Marr et al.,
1979) does not characterize completely the two-dimensional problem. In addition, centre-surround
receptive fields are not ideal bandpass filters, as required by Logan's version of the theorem (see
Appendices 1a, 1b). Clearly zero-crossings alone do not contain all the information (such as absolute
intensity values), but as one of us has found in an empirical investigation, natural images filtered with
$\nabla^2 G$ operators can be reconstructed to a good approximation from their zero-crossings and slopes.
A successful extension of the Logan type of analysis to two-dimensional patterns may therefore
represent one of the critical steps for perfecting this computational analysis of low level vision into a solid
theory.


1.3 The Line Detectors/Fourier Analysis Controversy: New Synthesis?

The previous ideas based on Logan's type of results not only lead to a satisfactory scheme for
the analysis of intensity changes in an image; they also have fascinating implications for visual
psychophysics and physiology, since they seem to account for basic properties of the first part of the
visual pathway. In particular these ideas explain why the image is filtered early on by approximately
bandpass centre-surround receptive fields; they make more precise the notion of "edge-detectors"
for extracting a symbolic description which contains full information about the image; and they state
that this can be achieved only if the image was previously filtered with several independent bandpass
channels, i.e. centre-surround receptive fields. As an immediate consequence these ideas also
provide a solution of the long-standing controversy about edge-detectors versus frequency channels
in the psychophysics and physiology of primate vision. The first stage of vision would indeed be
performed to a good extent by "edge" detectors (actually zero-crossing detectors) and certainly not
by Fourier analyzers; but in order for the zero-crossing detectors to extract meaningful information
it is necessary that they operate on the output of independent channels, roughly bandpass in spatial
frequency.

Many results from the psychophysics and physiology of early vision can be easily interpreted in this
new framework. It is, for instance, not too unreasonable to propose that the $\nabla^2 G$ filtering stage is
performed by ganglion cells of the retina and LGN, whereas a subclass of simple cells may represent
oriented zero-crossing segments. In this context it is not important how this is implemented in detail:
one of the several possibilities is that simple cells may read the zero-crossings profile from the fine
grid of small cells in layer 4C of the striate cortex, where a reconstruction of the filtered image, at
different scales, may be performed (via intracortical inhibition) with the goal of providing a very
accurate position of the zero-crossings (see later).

Several gaps have still to be filled in the computational theory of zero-crossings. For instance,
since zero-crossings do not represent the complete information about the image, it is important to
characterize the other primitives that are needed. At the other levels of explanation experimental
evidence in favour or against zero-crossings is of course highly desirable. Since the summer day in
Tübingen where D. Marr with one of us first formulated the idea of zero-crossings in the output of
independent, roughly bandpass filters, we cannot help feeling that its experimental validation, or
falsification, is of critical importance for further developments of our approach to low-level vision.

SPATIOTEMPORAL INTERPOLATION


1. Visual information processing: why spatiotemporal interpolation?

Any visual processor with human-level performance must be capable of analyzing time-varying
imagery. The analysis starts with the spatiotemporal interpolation of the raw visual input. The spatial
resolution of the photosensitive image available for processing is limited by the sampling density
of the photosensitive elements in the sensor and by noise. Image motion introduces the additional
problem of temporal resolution. The limiting factors are the frame rate and the integration time
determined by the sensitivity of the photosensitive elements. This is of little consequence for a stationary
scene, but for moving targets it poses the problem of motion smear.

The problem of high spatiotemporal resolution can be partially overcome by using better sensors
with larger arrays and higher frame rate. There are, however, technological and physical limits to
the spatiotemporal resolution that can be achieved in this manner, since increasing the spatial and
temporal sampling rate reduces the number of photons per sensor element per cycle. Consider that
since the number $x$ of photons is Poisson distributed, $\sigma = \sqrt{x}$; the number of distinguishable levels
was estimated by Barlow (1981) to be roughly $n = 2\sqrt{x}$. Thus 8 bits of resolution ($n = 256$) requires
about $2^{14} \approx 10^4$ photons. Note that the light intensity of a bright surface is $10^{?}$ cd/m$^2$ and this
means $10^{?}$ photons per 50 msec per sensor, assuming a sensor efficiency similar to the human cones!
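
The arithmetic behind the 8-bit figure can be checked directly. The sketch below simply inverts the rough estimate $n = 2\sqrt{x}$ quoted in the text; the function names are illustrative and not from the source.

```python
import math

def distinguishable_levels(photons):
    """Rough estimate quoted in the text: n = 2 * sqrt(x) for x photons."""
    return 2.0 * math.sqrt(photons)

def photons_needed(bits):
    """Invert n = 2 * sqrt(x) for n = 2**bits distinguishable levels."""
    levels = 2 ** bits
    return (levels / 2.0) ** 2

print(photons_needed(8))               # 16384.0, i.e. 2**14
print(distinguishable_levels(16384))   # 256.0
```

Because the number of levels grows only as the square root of the photon count, each additional bit of intensity resolution costs a fourfold increase in photons per sensor element per cycle.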

Fortunately, the performance of a given sensor can be improved by appropriate spatiotemporal
interpolation schemes. As we have seen, using such processes the human visual system achieves an
extremely high spatiotemporal resolution compared to the sampling density of the photoreceptors and
their integration time.

In summary then, temporal acuity, spatial acuity and motion smear are different facets of the same
general problem posed to a visual processor by time-varying imagery. We turn now to examine how
the human visual processor deals with it.


2.1 Visual acuity in human vision

Since the first measurements of vernier acuity in 1892 by Wuelfing in Tübingen, the extraordinary
accuracy with which the human eye can estimate the relative positions of lines or other features in
the visual field has represented a long-standing puzzle in vision research. Acuity of this type, also
called hyperacuity, can be measured in a variety of situations. A typical example is the acuity found
in reading a vernier (see inset of fig. 8a). This can be as fine as 5" of arc (Westheimer and McKee
1975), that is 0.02 mm at 1 metre distance. The astonishing precision of this performance can be seen
when the optical properties of the human eye are considered. In the fovea the hexagonal grid of cones
samples the visual image with a sampling interval of no less than 25", well matched to the optical
point spread function of the eye (its gaussian core has a half width of about 45", corresponding to a
spatial frequency of 60 cycles/degree).
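
The figures in this paragraph can be related by simple angular arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and the Nyquist relation (half the sampling rate) is the standard sampling-theorem bound rather than a statement taken from the text.

```python
import math

ARCSEC_PER_DEGREE = 3600.0

def arcsec_to_mm(arcsec, distance_m):
    """Lateral extent (mm) subtended by a visual angle at a given distance."""
    angle_rad = math.radians(arcsec / ARCSEC_PER_DEGREE)
    return math.tan(angle_rad) * distance_m * 1000.0

def nyquist_cycles_per_degree(sampling_interval_arcsec):
    """Highest spatial frequency representable without aliasing."""
    samples_per_degree = ARCSEC_PER_DEGREE / sampling_interval_arcsec
    return samples_per_degree / 2.0

print(arcsec_to_mm(5.0, 1.0))              # about 0.024 mm
print(nyquist_cycles_per_degree(30.0))     # 60 cycles/degree
print(25.0 / 5.0)                          # sampling interval / vernier threshold
```

On these numbers a 5" vernier threshold is about one fifth of the smallest foveal sampling interval, which is the sense in which the acuity is "hyper".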


Most remarkably of all, vernier acuity is not affected by movement at constant velocity of the
target in a velocity range from 0°/sec to at least 4°/sec (Westheimer & McKee, 1975). This means
that a subject can detect the relative position of two lines to within a fraction of a receptor diameter
(and spacing) while the whole pattern is moving across 70 receptors in 150 msec. Recently, evidence
has been accumulating which suggests that the visual system is able to perform a very precise
temporal interpolation as well, by reconstructing the spatial pattern of activity at moments intermediate
between discrete temporal presentations (Barlow, 1979). The most telling demonstration, apart from
cinematography, was introduced by D. Burr (1979a, see also Morgan, 1980) and is shown in the top
inset of fig. 8c. Vernier line segments are displayed stroboscopically at a series of stations to portray
a moving vernier; an illusory displacement occurs if the line segments are accurately aligned in space
but are displayed with a few milliseconds delay in one sequence relative to the other. Not only do the
segments appear to move smoothly from one station to the next but also, between the strobes, they
are seen to occupy positions between those where they are actually exposed. The accuracy of detecting
the equivalent displacement is again in the vernier acuity range, provided that the target moves at
constant speed and elicits a clear sensation of motion. One is forced to conclude that not only spatial
but also temporal interpolation is performed in the visual system to preserve acuity (and resolution)
for objects in motion (see Barlow, 1979).

It is clear that the attainment of such spatiotemporal accuracy does not break any physical law (see
Westheimer, 197?). As pointed out by Barlow (1979) and by Crick et al. (1980), the classical sampling
theorem allows a correct reconstruction of the visual input from a set of discrete samples in space
and time, since the LGN signal is bandlimited in temporal and spatial frequency by the photoreceptor
kinetics and the eye's optics respectively. In particular, Crick et al. have suggested (similarly to
Barlow) that the fine grid of granule cells in layer IVc of the striate cortex performs an interpolation
on the output of the LGN fibres, with the goal of representing the position of zero-crossings (the
boundaries between activity in an ON and OFF ganglion cell layer) with a very high accuracy (see
also Marr and Hildreth, 1980 and Marr et al., 1979).
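
The sampling-theorem argument can be made concrete with a small numerical sketch: a bandlimited signal is reconstructed between its samples by Whittaker-Shannon (sinc) interpolation, and a zero-crossing is then located far more finely than the sample spacing. This illustrates the theorem only and is not a model of the cortical mechanism; the test signal, the function names and the bisection refinement are assumptions introduced here.

```python
import math

def sinc(u):
    """Normalized sinc: sin(pi u) / (pi u), with sinc(0) = 1."""
    return 1.0 if u == 0.0 else math.sin(math.pi * u) / (math.pi * u)

def reconstruct(samples, t):
    """Whittaker-Shannon interpolation at position t, measured in units of
    the sample spacing (sample k sits at t = k)."""
    return sum(s * sinc(t - k) for k, s in enumerate(samples))

def refine_zero_crossing(samples, lo, hi, iterations=40):
    """Bisection on the reconstructed signal between two positions whose
    reconstructed values have opposite signs."""
    f_lo = reconstruct(samples, lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = reconstruct(samples, mid)
        if (f_lo > 0) == (f_mid > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical smooth signal whose true zero-crossing lies at t0 = 20.3,
# i.e. between samples 20 and 21. It is very nearly bandlimited below the
# Nyquist frequency of the unit-spaced sampling grid.
sigma, t0 = 3.0, 20.3
samples = [(k - t0) * math.exp(-((k - t0) ** 2) / (2.0 * sigma ** 2))
           for k in range(41)]
print(refine_zero_crossing(samples, 20.0, 21.0))
```

The recovered position is expected to agree with the true crossing to a small fraction of the sample spacing, provided the signal is adequately bandlimited and noise is low.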

Although spatiotemporal interpolation can be well understood in terms of information theory,
the astonishing performance of the visual system seems to require an algorithm and corresponding
mechanisms of great ingenuity and precision. As we hinted earlier, an understanding of visual
interpolation may also be quite interesting from a purely information processing point of view. High
resolution, smear-free real time imagery could benefit significantly from this study of human vision.
Here we investigate some properties of this spatiotemporal interpolation. In particular, we examine its
performance for a range of "sampling intervals" in space and time.


2.2  Methods 

The vernier target used in these experiments consisted of a thin vertical bar made up of two segments.
The stimuli were generated on a Tektronix 604 display under the control of analog electronics. Each
bar was intensified for 0.1 msec at $\Delta t$ msec intervals at $n$ successive stations horizontally displaced by
a separation $\Delta x$. Each of the two segments making up the bar was 24' high and 1.5' wide, intensified
to a luminance of about 50 times detection threshold on a background of 10 cd/m$^2$. During an
experimental run, a target was presented every 3 seconds. Brief displays of $T = 150$ msec
with randomized direction of motion (terminating at the central fixation point) were used to prevent
effective pursuit eye movements (Westheimer, 1954). The experiments measured

Figure 3. Vernier resolution threshold of spatial offset for different separations $\Delta x$ between the stations as a function of
velocity (deg/s). Symbols denote separations $\Delta x$ = 1', 2.5', 7.5', 15' and 30'. Fig. 3a shows the data from subject AK, fig. 3b
from subject TV. The standard deviation of the data is about [illegible]% of the threshold value for fig. 3a and [illegible]% for
fig. 3b. In fig. 3a the point for $\Delta x$ = [illegible] and $v$ = [illegible] was measured masking the beginning and the ending of
the trajectory; the same procedure did not change the threshold for the point at $v$ = [illegible]. Of the two points at
$\Delta x = 2.5'$ and $v$ = [illegible] in fig. 3a, the worse value has been measured under the "masking" condition whereas the
better one was measured in the standard way. In fig. 3b also the point at $\Delta x = 2.5'$ and $v = 2.5$°/sec was measured with zero
offset at the first and last station (from Fahle and Poggio, 1981).
a) the acuity for detection of real vernier offsets of the two segments by $\delta x$ seconds of arc

b) the acuity for detection of apparent vernier offsets produced by delaying the presentation of the
lower or upper segment, displayed at the same sequence of stations, by $\delta t$ msec

c) the acuity for detection of mixed vernier offsets produced by a real spatial offset $\delta x$ together with
a temporal delay $\delta t$ of opposite sign.

In a forced choice task the subject was required to signal whether the bottom segment was
displaced to the right or to the left of the top segment by setting a binary switch. Acuity was determined
by the standard criterion of 75% correct identification. In all experiments reported here $T$ is constant
($T = 150$ msec) and, as a consequence, the number of stations $n$ is variable ($n$ = 2 to 95). More details
about the methods are given in Fahle and Poggio (1981).
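
One simple way to read a 75% correct threshold off forced-choice data is linear interpolation between the two tested offsets that bracket the criterion. The sketch below is an assumption about how such a threshold could be computed, not the procedure used in the experiments; the function name is hypothetical and the numbers in the example are made up purely for illustration.

```python
def threshold_at(offsets, proportion_correct, criterion=0.75):
    """Offset at which performance first reaches `criterion`, by linear
    interpolation between neighbouring tested offsets. Returns None if the
    criterion is never bracketed by the data."""
    points = list(zip(offsets, proportion_correct))
    for (x0, p0), (x1, p1) in zip(points, points[1:]):
        if p0 < criterion <= p1:
            return x0 + (criterion - p0) * (x1 - x0) / (p1 - p0)
    return None

# Hypothetical psychometric data: offsets in seconds of arc and the
# proportion of correct left/right responses at each offset.
offsets = [2.0, 4.0, 8.0, 16.0]
correct = [0.55, 0.65, 0.85, 0.95]
print(threshold_at(offsets, correct))
```

In practice a smooth psychometric function fitted to all the data points would normally be preferred to a two-point interpolation.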
Figure 4. Vernier resolution thresholds of temporal offset for different separations $\Delta x$ between the stations as a function of
velocity. Symbols denote separations $\Delta x$ = 1', 2.5', 7.5', 15' and 30'. Fig. 4a shows the data from subject AK, fig. 4b from
subject TV. The standard deviation is about [illegible]% of the threshold values for subject AK and 15% for subject TV
(from Fahle and Poggio, 1981).

2.3 The Spatial Type of Acuity: Dependence on Velocity (v) and Separation ($\Delta x$)

The results for spatial offsets (with simultaneous presentation of the two segments at each station) are
shown in figs. 3a,b. The main result is that spatial acuity is relatively independent of the separation
between the stations and of the velocity of the target up to rather large velocities. These data confirm
and extend Westheimer's and McKee's results (1975), which showed that vernier acuity is unaffected
by rate of movement from 0°/sec up to 4°/sec. Our results imply that this type of vernier acuity is
relatively independent of $\Delta t$, the strobe interval.


2.4 The Temporal Type of Acuity: Dependence on v and $\Delta x$

Figs. 4a,b show the results for temporal offsets. The accuracy of detecting the equivalent
displacement is in the classical vernier acuity range (compare Burr, 1979a,b): the best value for observer AK
was 8" for spatial and 5" for temporal offset at comparable separations and velocities. Our main new
result is that although acuity does not break down for large separations between the stations, at least
up to half a degree, it deteriorates significantly, almost in proportion to $\Delta x$ (see fig. 5).

Figure 5. Fig. 5a shows the best vernier resolution threshold (with temporal offset) for each separation $\Delta x$.
The data are from three subjects, AK, TV and HW (partly from figs. 4a and 4b). In fig. 5b the velocity
$v$ for which optimal vernier resolution is found is plotted against the separation $\Delta x$. Same data as in fig. 5a.
From Fahle and Poggio (1981).

Vernier acuity of this temporal type is bad at low and high speed. As already clearly demonstrated
by Burr (1979a,b), apparent motion is necessary for temporal offsets to be seen as spatial offsets. In
our experiments, deterioration of acuity at low velocities could be due to the speed per se as well as to
the lower number of stations (because our total presentation time is constrained to $T = 150$ msec the
stimulus consisted, at the lowest velocities, of two stations). In any case, deterioration of acuity at low
velocities can be linked with a decreased sensation of motion.

A second important result is that the range of velocities for which temporal interpolation is good
shifts upwards for larger separations between the stations. The fact that at higher separations higher
velocities are required for good resolution suggests that a more revealing parameter is the time
interval $\Delta t$ between the strobes. In fact, at any separation $\Delta x$, temporal interpolation is optimal for a
temporal interval $\Delta t$ between 20 msec and 50 msec.
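
Since successive stations are separated by $\Delta x$ in space and $\Delta t$ in time, the velocity is $v = \Delta x / \Delta t$, and the 20 to 50 msec optimum translates into a velocity band that scales with the separation. The sketch below only restates that relation; the function names are illustrative and not from the source.

```python
ARCMIN_PER_DEGREE = 60.0

def strobe_interval_ms(separation_arcmin, velocity_deg_per_s):
    """Time between successive stations: dt = dx / v, in milliseconds."""
    separation_deg = separation_arcmin / ARCMIN_PER_DEGREE
    return separation_deg / velocity_deg_per_s * 1000.0

def optimal_velocity_range(separation_arcmin, dt_min_ms=20.0, dt_max_ms=50.0):
    """Velocities (deg/s) for which dt falls between dt_min_ms and dt_max_ms."""
    separation_deg = separation_arcmin / ARCMIN_PER_DEGREE
    slowest = separation_deg / (dt_max_ms / 1000.0)
    fastest = separation_deg / (dt_min_ms / 1000.0)
    return slowest, fastest

# Separations used in the experiments (minutes of arc).
for dx in (1.0, 2.5, 7.5, 15.0, 30.0):
    print(dx, optimal_velocity_range(dx))
```

For the largest separation of 30' this gives a band of roughly 10 to 25 deg/s, which illustrates why the good range of velocities shifts upwards as the separation grows.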
2.5 The Effect of Blur on Spatial and Temporal Acuity

Standard vernier acuity is known to be affected, as one would expect, by attenuation of the high
spatial frequencies of the vernier pattern (see for instance Stigmar, 1971). Is temporal interpolation
also degraded in the same way?

We have performed some experiments to answer this question by placing a ground glass screen at
1 cm in front of the display. When a sharp line is viewed through such a ground glass screen the
resulting light distribution has an approximately Gaussian line spread function with a width at
half-height of at least 15', corresponding to a cutoff frequency of around 3-4 cycles/deg. Our data show
that in the experimental situation of fig. 4, blur of the pattern improves acuity at large separations and
velocities. Fig. 6 compares directly for the same observer and for the same separation the effect of
blur on spatial and temporal interpolation. Westheimer's type of acuity is degraded by blur, whereas
Burr's type of acuity improves dramatically with blur (at high velocities). Out of five observers only in
one case did blur of the pattern cause a reduction in temporal vernier acuity at high separations and
velocities.

These data again show that temporal hyperacuity has different characteristics from spatial
hyperacuity.


2.6.  Spatial  vs.  Temporal  Offset 

The apparent offset δx' produced by temporal delay δt should follow the ideal relationship δx' =
vδt. As shown by our data the sign of the offset is indeed correctly detected. Does its size also satisfy
this relation? How faithful, in other words, is temporal interpolation? To answer this question we
measured the temporal delay δt needed to compensate for a given real spatial offset δx for different
conditions.
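
The ideal relationship δx' = vδt can be made concrete with a short numerical sketch. The function names and the 10" offset used below are illustrative assumptions, not values taken from the experiments; only the velocity of 1.1°/sec is one of the conditions discussed in this section.

```python
ARCSEC_PER_DEG = 3600.0


def apparent_offset_arcsec(velocity_deg_per_s: float, delay_ms: float) -> float:
    """Ideal apparent spatial offset (arcsec) produced by a temporal delay: dx' = v * dt."""
    return velocity_deg_per_s * (delay_ms / 1000.0) * ARCSEC_PER_DEG


def compensating_delay_ms(velocity_deg_per_s: float, offset_arcsec: float) -> float:
    """Delay (msec) that would exactly cancel a real spatial offset if dx' = v * dt held."""
    if velocity_deg_per_s == 0.0:
        raise ValueError("a stationary target cannot convert a delay into an offset")
    return offset_arcsec / ARCSEC_PER_DEG / velocity_deg_per_s * 1000.0


# Hypothetical example: a 10 arcsec offset at v = 1.1 deg/sec.
delay = compensating_delay_ms(1.1, 10.0)
print(f"ideal compensating delay: {delay:.2f} msec")
```

A measured compensating delay larger than this ideal value would indicate that temporal interpolation underestimates the offset.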

Fig. 7 shows that for a separation Δx = 2.5' and a velocity v = 1.1°/sec the apparent offset
δx' = vδt matches rather closely the real spatial offset δx. Under these conditions spatiotemporal
interpolation is indeed rather precise (compare Burr and Ross, 1979). It is not so for higher velocities
and/or larger separations (fig. 5). The temporal offset needed to compensate for a real spatial offset is
then much larger.

Figure 6. The effect of blur on [illegible] interpolation as a function of velocity for a separation
[value illegible]. Vernier resolution of a spatial offset is measured with (•) and without (o) blur.
Vernier resolution of a temporal offset is also shown with and without blur [symbols illegible]. The
screen was blurred as described in the text. Notice that the first point for spatial offset is for
v = [illegible]. The observer is [illegible]. The standard deviation is about [illegible] of the threshold
values. From Fahle and Poggio (1981).


3.1. Spatiotemporal Interpolation: How is it Done?

The previous results constrain the problem of hyperacuity tightly enough to justify a theoretical
analysis of how spatiotemporal interpolation may be done in the visual system. The precise meaning
of interpolation in terms of our visual stimuli is a well defined question, and this is the main point to
discuss.

3.1.1. A Simple Illustration

Figure 7. Temporal (δt) vs. spatial (δx) offset in the compensation experiment. The ordinate shows
the temporal offset (in equivalent spatial units δx' = v·δt) needed to compensate the spatial offset
shown in the abscissa. • is for a separation between the stations Δx = 2.5' and a velocity
v = 1.11°/sec (Δt = 37 msec). A second symbol [illegible] is for Δx = 2.5' and v = 5.2 [units and Δt
illegible]. O is for Δx = 7.5' and v = 1.11°/sec (Δt = [illegible] msec). Larger separations yield an
even greater mismatch. The continuous diagonal indicates the loci of perfect compensation. Subject TV.
From Fahle and Poggio (1981).

Fig. 8 illustrates a very simple scheme for achieving spatiotemporal interpolation of a visual pattern.
The elements of this scheme could be interpreted as cells with associated receptive fields and temporal
impulse responses. Alternatively, Fig. 8 represents a computational scheme for spatiotemporal
interpolation. Visual input is sampled in space by an array of cells with a sampling density high enough to
preserve the whole of the spatial information (in accordance with the sampling theorem). The input
is then reconstituted in more detail on a finer grid of cells by convolving the sampled values with the
function sinc x. In effect each cell of the interpolation layer weights its inputs according to a centre
surround receptive field. A variety of filters (i.e. "receptive fields") are capable of performing a correct
interpolation, especially in two spatial dimensions (see Crick et al. 1980).
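
The spatial step of this scheme, convolving the sampled values with a sinc function, can be illustrated numerically. The following is only a sketch under stated assumptions: the input is a one-dimensional cosine chosen for illustration, the array is finite (so the reconstruction is approximate near its ends), and the function names are ours, not part of the model.

```python
import math


def sinc(u: float) -> float:
    """Normalized sinc: sin(pi*u) / (pi*u), with sinc(0) = 1."""
    if u == 0.0:
        return 1.0
    return math.sin(math.pi * u) / (math.pi * u)


def sinc_interpolate(samples, spacing, x):
    """Estimate the signal at position x from samples taken at positions k * spacing."""
    return sum(s * sinc((x - k * spacing) / spacing) for k, s in enumerate(samples))


# Illustrative input (an assumption): a cosine of 0.1 cycles per unit length,
# sampled at unit spacing, i.e. well below the limit of 0.5 cycles per unit.
FREQ = 0.1
samples = [math.cos(2.0 * math.pi * FREQ * k) for k in range(401)]

x_mid = 200.5  # halfway between two samples, near the centre of the array
estimate = sinc_interpolate(samples, 1.0, x_mid)
truth = math.cos(2.0 * math.pi * FREQ * x_mid)
print(f"interpolated {estimate:.4f}, true value {truth:.4f}")
```

Each evaluation point plays the role of one cell of the interpolation layer, weighting the sampled values with a sinc shaped profile centred on its own position.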


If the input intensity distribution is presented at discrete instants in time, temporal interpolation
can be achieved by suitable temporal low pass properties of each individual pathway. If the temporal
interval between presentations is small enough the effect of the filter is to reconstruct the original
continuous temporal input. Spatial interpolation can then operate at each instant of time (this scheme
would of course operate successfully for continuous movement of a pattern).

Figure 8. (a) A simple scheme for spatiotemporal interpolation. The input pattern is sampled by an array
of "cells". Spatial interpolation is accomplished in a finer interpolation grid of cells, each one weighting the
sampled values with a sinc shaped receptive field (shown in the lower inset). Temporal interpolation is obtained
by filtering with an appropriate low pass or band pass filter each of the input channels (its impulse response
is shown in the upper inset). Thus a series of discrete frames of a moving pattern can be interpolated (see
Theorem 1 in Appendix 2) into a continuous temporal function in each of the channels. The spatial input
distribution outlined here represents an intensity [illegible] as seen by centre surround ganglion cells. (b) The
spatial interpolation process in Fourier space. Interpolation is equivalent to filtering out the side lobes
originated by the sampling process. Temporal interpolation can be interpreted in a similar way. From Fahle
and Poggio (1981).

Fig. 8b shows the Fourier interpretation of the spatial interpolation process (interpolation in time
can be interpreted in a similar way). The effect of sampling is to replicate the original spectrum in an
infinite number of side lobes. Spatial interpolation - i.e. reconstruction of the original function from
its samples - is accomplished by filtering out all side lobes but the central one - which is the original
spectrum.


This model is probably the simplest conceivable scheme. In it, interpolation in space and time are
performed independently, since the temporal dependence of the input is not constrained in any way.
We now consider the conditions under which this scheme can be effective.

3.1.2  Remarks  on  Interpolation 

Before embarking on an analysis of various interpolation schemes, it is appropriate to make a few
general points which arise from the discussion so far.

First, the process of computing intermediate values from samples does not depend on the existence
of a finer retinotopic grid of "cells", where the results are represented. All filtering transformations
indicated in Fig. 8 could be carried out at a rather symbolic level for only a few distinguished points.
Thus, it is important to keep separate the problem of a process from the problem of representing its
output. This paper is directly concerned only with the first issue.

Second, the goal of the interpolation process may be far more modest than a full reconstruction of
the input distribution. As suggested by Crick et al. (1980), the aim of interpolating the ganglion cells'
activity is to provide the position of the zero-crossings (where activity switches from the on centre
to the off centre cells) with high accuracy. This can be achieved by using very simple interpolation
functions such as a normal centre-surround receptive field (Marr et al., 1980).

3.1.3 More Complex Interpolation Schemes are Required

The scheme of Fig. 8 can provide a correct reconstruction of a spatiotemporal input sampled at
intervals Δx (in space) and Δt (in time) only when the input function is bandlimited in spatial (by a
cutoff f_x^c) and temporal (by a cutoff f_t^c) frequencies in such a way that Δx < 1/(2 f_x^c) and
Δt < 1/(2 f_t^c) (theorem 1 in Appendix 2). The image which reaches the retina is indeed bandlimited in
spatial frequencies to less than about 60 cycles per degree by the diffraction limited optics of the eye.
Furthermore, a temporal cutoff is imposed at the level of the photoreceptors by their limited temporal
resolution. The scheme of Fig. 8 can therefore correctly reconstruct an image sampled at intervals of
less than 30" in space (for the 2-D case see Crick et al., 1980). Temporal samples of the photoreceptor
activity could be interpolated under similar conditions (though regular temporal sampling in our visual
system is highly implausible).

Since the spacing of the photoreceptors is almost exactly matched to the eye's optics, interpolation
in normal vision - when the image is a continuous function of time and space - can be accounted
for by simple schemes like that of Fig. 8. In particular, such models could account for the vernier
acuity measured with real continuous motion of the retinal image. When, however, motion of an
object is simulated by presenting the image at discrete positions at separate instants, the conditions of
theorem 1 are in general no longer satisfied. In our experiments we present to the eye an image which
is already sampled either in time (Westheimer type of stimulus) or space (Burr type of stimulus) or
both. We enforce arbitrary sampling intervals Δx and Δt on the system before the bandlimiting
operations of the eye's optics and of the receptor kinetics come into play. Under these conditions
the input function g(x, t) is not ensured to be appropriately bandlimited before spatial or temporal
sampling occurs. The scheme of Fig. 8 should for instance perform poorly when the input function
is sampled in space at intervals Δx significantly coarser than the photoreceptor array. Burr's and our
data, however, show that under these conditions our visual system performs significantly better. We
are clearly forced therefore to consider other types of interpolation schemes.
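
The sampling condition invoked in this section is easy to check numerically. The sketch below assumes only the cutoff of about 60 cycles per degree quoted in the text; the function names are ours, and the 2.5' station separation is used simply as an example of a stimulus interval.

```python
ARCSEC_PER_DEG = 3600.0


def max_sampling_interval_arcsec(cutoff_cycles_per_deg: float) -> float:
    """Largest sampling interval allowed by the sampling theorem: 1 / (2 * cutoff)."""
    return ARCSEC_PER_DEG / (2.0 * cutoff_cycles_per_deg)


def satisfies_sampling_condition(interval_arcsec: float, cutoff_cycles_per_deg: float) -> bool:
    """True when the interval is finer than the limit for the given cutoff."""
    return interval_arcsec < max_sampling_interval_arcsec(cutoff_cycles_per_deg)


limit = max_sampling_interval_arcsec(60.0)  # optical cutoff quoted in the text
print(f"limit for a 60 cycle/deg cutoff: {limit:.0f} arcsec")

# A station separation of 2.5 arcmin (150 arcsec) is far coarser than this limit.
print(satisfies_sampling_condition(2.5 * 60.0, 60.0))
```

An interval of 150" exceeds the limit by a factor of five, which is why the independent scheme of Fig. 8 is not expected to cope with such stimuli.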

3.2.1 The Spatiotemporal Spectrum of a Moving Vernier

Our analysis of alternative interpolation schemes begins with the description in frequency space of
the physical stimuli corresponding to Westheimer's and Burr's experimental situations. When a spatial
pattern g(x) moves continuously at constant speed, the resulting spatiotemporal distribution of
excitation on the retina has a simple representation in the Fourier space of temporal (f_t) and spatial (f_x)
frequencies. Its Fourier transform takes values only on the diagonal line shown in fig. 9a with a slope
equal to the velocity (see Appendix 2). For each spatial frequency contained in the pattern, there is
a unique temporal frequency corresponding to it. Curtailing the duration of motion (in our case to
T = 150 msec) spreads the Fourier transform over a large area of temporal and spatial frequencies,
changing the narrow line into a wider area. The spread (along the f_t axis) is the same for all our data.
Thus the line supports shown in fig. 9 must be interpreted as being spread along f_t as a sinc function.
For T = 150 msec the width of the spread is about 14 Hz for the central lobe of the sinc function and
28 Hz for the central lobe plus the first negative side lobe on both sides. The retinal stimulus elicited
by continuous motion of a vernier at constant velocity can be described in this way (see Appendix 2).
The upper and the lower segment have the same line support on the f_x - f_t plane. Their Fourier
transforms differ at all frequencies only by a phase factor which mirrors the spatial offset. The correct
detection of this information underlies positional acuity.
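
The statements of this section can be written out explicitly. The following is a sketch under an assumed Fourier convention (transform kernel exp[-2πi(f_x x + f_t t)], sinc u = sin(πu)/(πu), motion written as g(x + vt)); the formal treatment is the one given in Appendix 2, and the symbols s, s_T and rect are introduced here only for illustration.

```latex
% Assumed convention: \tilde g(f_x) = \int g(x)\, e^{-2\pi i f_x x}\, dx
\begin{align}
  s(x,t) &= g(x + vt)
    &&\Longrightarrow&
  \tilde s(f_x, f_t) &= \tilde g(f_x)\,\delta(f_t - v f_x) \\
  s_T(x,t) &= g(x + vt)\,\mathrm{rect}(t/T)
    &&\Longrightarrow&
  \tilde s_T(f_x, f_t) &= \tilde g(f_x)\; T\,\mathrm{sinc}\bigl(T\,(f_t - v f_x)\bigr) \\
  s_{\delta x}(x,t) &= g(x - \delta x + vt)
    &&\Longrightarrow&
  \tilde s_{\delta x}(f_x, f_t) &= e^{-2\pi i f_x \delta x}\,\tilde g(f_x)\,\delta(f_t - v f_x)
\end{align}
% The sinc in the second line has its first zeros at f_t - v f_x = \pm 1/T, so
%   central lobe width              = 2/T \approx 13.3\ \mathrm{Hz}  for T = 0.15 s,
%   central lobe + first side lobes = 4/T \approx 26.7\ \mathrm{Hz}.
```

The first line gives the line support of slope v, the second the spread along f_t caused by the finite duration T (in rough agreement with the figures of about 14 Hz and 28 Hz quoted above), and the third the phase factor by which the two segments of the vernier differ.
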

Figure 9. Legend

a) The support on the f_x - f_t plane of the Fourier spectrum associated with continuous motion
of a vernier (see inset) at constant velocity -v. The slope of the line is v. g(f_x, f_t) equals g(f_x)
on that line. Curtailing the duration of motion to T = 150 msec spreads the line into a bar like
support, corresponding to a sinc function. b) The support of the Fourier spectrum associated with
Westheimer's type of experiment. The inset indicates that displaying the vernier stroboscopically at a
sequence of times with an interval Δt is equivalent to "looking" at the continuous motion of a vernier
through a series of temporal "slits". This has the effect of replicating the spectrum of fig. 9a along
the f_t axis in an infinite number of side lobes. The distance of the lobes on f_t is 1/Δt. The line
encounters the f_x axis at 1/(v·Δt) = 1/Δx (if Δx = 1', the distance of the side lobes on f_x is
60 cycle/deg). Notice that for any f_x each lobe supports the same complex Fourier spectrum.
c) The support of the Fourier spectrum associated with Burr's type of experiment. Displaying the
line segments of a vernier in the same position but with a slight delay is equivalent to looking at the
continuous motion of a vernier through the spatial window depicted in the inset (transparent slits in
an otherwise opaque screen). This corresponds to replicating the spectrum of fig. 9a along the f_x axis.
The distance of the lobes is 1/Δx, where Δx is the interval between successive slits in the spatial
window. At a given f_x the Fourier spectrum g(f_x) of different lobes is in general different. d) The
support of the Fourier spectrum associated with the compensation experiment is the same as in fig. 9c.
The different window corresponding to this stimulus (see inset) corresponds, however, to a different
complex Fourier spectrum (see Appendix 2). From Fahle and Poggio (1981).

Fig. 9 summarizes the description of the two stimulus configurations used in this paper
according to the derivation outlined by Fahle & Poggio (1981). Westheimer's experimental situation
is equivalent to looking at the continuous motion of a vernier through a series of equidistant narrow
temporal slits within which the pattern is briefly visible (see fig. 9b). Burr's experimental situation
ideally corresponds to a vernier moving behind a spatial window with a series of equidistant narrow
slits (see fig. 9c). The spatial or temporal windows affect differently the spectrum of the retinal input.
As indicated in fig. 9, in the Westheimer situation the complex spatial spectrum of the pattern,
which contains amplitude and phase information, is replicated an infinite number of times along the
temporal frequency axis, whereas in the Burr case the same spectrum is replicated along the spatial
frequency axis. An important observation is that in fig. 9b (Westheimer stimulus) all lobes at any
given f_x support exactly the same complex spectrum g. This is not so in fig. 9c (Burr stimulus), where,
instead, all lobes have the same g at any given f_t. We re-emphasize that fig. 9 describes the physical
properties of the different stimuli without any reference to the human visual system.

3.2.2 Computational Aspects of Interpolation: The Constant Velocity Assumption

More effective interpolation schemes are feasible if general constraints about the nature of the visual
input are incorporated directly in the computation. The key observation here is that the temporal
dependence of the visual input is usually due to movement of rigid objects, and that in everyday life
motion has a nearly constant velocity over the times and distances which are relevant to the
interpolation process (T ≈ 100 msec and x < [illegible]). The constant velocity assumption leads to a more
specific form of the sampling theorem, given in Appendix 2 (see also Crick et al., 1980), which states
formally what is intuitively clear: the spatiotemporal sampling rate can become very low without losing
information. Interpolation schemes based on the constant velocity assumption exploit the equivalence of
the time and space variable (x ≈ vt). From the point of view of filtering this means that spatial
and temporal interpolation cannot be performed independently as in the simple scheme of Fig. 8. In
the Fourier domain the constant velocity assumption constrains the spectrum of the visual input to
lie on the line support shown in Fig. 9a. In the ideal case of infinitely long motion the side lobes
generated by sampling either in time (Fig. 9b) or space (Fig. 9c) can always be excluded by means
of appropriate filters, if the precise value of v is known (e.g. by measurements). The recovery of
the original spectrum (Fig. 9a) corresponds to an ideal interpolation for arbitrarily large sampling
intervals (if v is known and different from zero). In the realistic case of finite duration of motion, finite
sampling intervals are enforced by the spread of the Fourier spectrum into a larger area, but the same
basic arguments still apply.

3.2.3 Implementing the constant velocity scheme

An interpolation scheme of this type could be implemented simply by measuring the exact velocity
of movement and then reconstructing the spatiotemporal trajectory of the pattern for either temporal
or spatial information. Another, more attractive possibility is suggested by the idea, supported by
much psychophysical evidence, that in the human visual system there exist several channels at each
eccentricity, i.e. several sets of receptive fields tuned to different spatial sizes and with different
temporal properties. We imagine, following Burr (1979b) that these channels have somewhat
overlapping supports covering the region of the (f_x - f_t) Fourier plane which corresponds to the sensitive
range of the visual system. "Stasis" channels are tuned to high spatial frequencies (small receptive
fields) and low temporal frequencies (sustained properties); "motion" channels are tuned to low
spatial frequencies (large receptive fields) and high temporal frequencies (transient properties). Thus,
each channel is tuned to a different range of velocities, centred on the ratio between the optimal
temporal and spatial frequencies characteristic for the channel: stasis channels for instance are tuned
to low velocities whereas motion channels are tuned to high velocities. Fig. 10b shows a set of
idealized "velocity channels" of this type. Since each channel has its own cutoff in temporal and spatial
frequency, interpolation may be performed independently and with different characteristics within
each channel. In the Burr type of experiment stasis channels could correctly interpolate only patterns
displayed at small separations and low velocities, whereas motion channels could be effective (but not
so accurate) at large separations and high velocities by filtering out the side lobes arising from the
coarse spatial sampling. The complementary argument applies for coarse time sampling. As indicated
in Fig. 10b the stasis channels may suffer from aliasing at values of Δx for which the motion channels
interpolate correctly. We assume, then, that in this scheme the wrong channels are switched off by use
of velocity information.
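
The relation between a channel's tuning and its preferred velocity can be sketched in a few lines. The channel names follow the text, but the numerical tunings below are purely hypothetical values chosen for illustration, and the selection rule (nearest preferred velocity on a logarithmic scale) is our assumption about how "switching off the wrong channels" might be expressed, not a claim about the visual system.

```python
import math


def preferred_velocity(temporal_hz: float, spatial_cycles_per_deg: float) -> float:
    """Velocity (deg/sec) on which a channel is centred: optimal f_t divided by optimal f_x."""
    return temporal_hz / spatial_cycles_per_deg


# Hypothetical tunings: (optimal temporal frequency in Hz, optimal spatial frequency in cycle/deg).
CHANNELS = {
    "stasis": (2.0, 20.0),   # low temporal, high spatial frequency
    "motion": (10.0, 2.0),   # high temporal, low spatial frequency
}


def best_channel(velocity_deg_per_s: float) -> str:
    """Channel whose preferred velocity is closest to the stimulus velocity (log scale)."""
    def mismatch(name: str) -> float:
        f_t, f_x = CHANNELS[name]
        return abs(math.log(velocity_deg_per_s / preferred_velocity(f_t, f_x)))
    return min(CHANNELS, key=mismatch)


for name, (f_t, f_x) in CHANNELS.items():
    print(name, preferred_velocity(f_t, f_x), "deg/sec")
print(best_channel(4.0))
```

With these illustrative numbers the stasis channel is centred on 0.1°/sec and the motion channel on 5°/sec, so a target moving at 4°/sec would be assigned to the motion channel.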

Fig. 10c shows a more realistic interpolation scheme of the same basic type. Instead of many
channels, each one sharply tuned to velocity and inactivated when the pattern does not move at its
characteristic velocity, there are a few channels coarsely tuned to velocity and without any precise
velocity sensitive inactivation, apart from directional selective properties.

In the light of this analysis we turn now to a detailed discussion of our experiments. Our main
question concerns of course which type of interpolation scheme is actually used by our visual system.


4.1 Westheimer's Acuity: Recovery of Spatial Offset

a) In Fourier terms, the aim of the interpolation process is to filter out the side lobes, preserving only
the central lobe, as the latter represents the Fourier spectrum of a continuously moving bar.

When both the time interval Δt between presentations and the velocity v are small, interlacing
of the side lobes in the Fourier spectrum is negligible. Temporal low pass properties of the visual
pathway, as in the model of fig. 10a, suffice for eliminating the side lobes and thus achieve a correct
interpolation. When Δt is large, however, interlacing is considerable in the sense that, even for the
scheme of fig. 10c, there are one or more channels which mix the main lobe with at least one of the
side lobes. Because of the spread associated with the short duration of the motion sequence, actual
overlap between the lobes can be significant. It turns out, however, that this does not represent a
problem from the point of view of the spatial acuity measured in our experiments. At each f_x the
complex Fourier spectrum on all side lobes is exactly the same. Thus, the spatial spectrum is correct
irrespectively of the temporal frequency and independently of the number of side lobes contained
in the support of the interpolation filters. At large Δx and high v, the presence of the side lobes
turns out to be even beneficial for vernier acuity; under these conditions high frequency channels,
which would not be stimulated by continuous motion, can obtain the correct spatial information from
the side lobes, which are an artefact of the discrete time presentations. On the whole, and in the
absence of a sophisticated interpolation process that always excludes all side lobes (such as the scheme
of fig. 10b), one expects vernier acuity to be rather invariant for a wide range of separations and
velocities. Our data conform well to these expectations. Notice that the presence of side lobes at high
velocities and large separations corresponds to the perception not of a moving bar but of a briefly
illuminated stationary grating - which carries however the correct spatial information. In this sense
at large Δx and high v interpolation fails to retrieve the "correct" spatiotemporal pattern, but still
preserves spatial acuity (even at extremely high speeds).

Figure 10. (a) The support on the Fourier plane of spatial and temporal frequencies of an interpolation
filter corresponding to a scheme such as Fig. 8. (b) The support on the Fourier plane of a set of spatiotemporal
filters ideally tuned to different velocities. A large number is needed to cover all velocities of interest. The
filters are assumed to be direction selective, since they only operate in the Fourier quadrants corresponding
to positive v = f_t/f_x in g(x + vt). A spatial pattern moving at constant velocity and sampled at spatial
intervals Δx has on this plane the support shown by fig. 9c. To avoid aliasing, the low velocity filters can
be "switched off" by information about the velocity of the motion. (c) A more realistic set of filters, broadly
tuned to different velocities. The stasis channel is tuned to low temporal and high spatial frequencies and
thus to low velocities. The motion channel is tuned to high temporal and low spatial frequencies and thus
to high velocities. Intermediate channels (not shown here) may also be present. The hatched areas represent
the support of such directional filters. Nondirectional filters would have also a symmetric support in the other
two quadrants. From Fahle and Poggio (1981).

b) The qualitative interpretation of our data in usual space-time variables is straightforward. Spatial
interpolation, for instance by appropriate receptive fields, takes place correctly for each frame (i.e. for
each station) even when temporal interpolation fails. Since our forced choice task measures only
spatial acuity, performance is in this case independent of the interpolation of the temporal dependence of
the visual input.

c) These results suggest that spatiotemporal interpolation is not performed by the "ideal"
interpolation scheme of Fig. 10b. If it were, temporal aspects should be retrieved correctly at all Δt, while
acuity for high velocities should be exactly as bad as for continuous motion. The one channel scheme
of Fig. 10a could explain these data on positional acuity; but as pointed out by Burr (1979, 1980) the
image should then be inevitably smeared at all but very low velocities.

4.2 Burr's Acuity: Interpolation of Temporal Offset

a) In Burr's experiment the situation is quite different. For any given f_x the side lobes contain
different parts of the original spectrum. Thus when more side lobes lie in the support of the same
channel (in fig. 10a or fig. 10c) there is a mixture of spatial frequencies, detrimental to acuity. One
understands, therefore, that acuity deteriorates considerably (see fig. 2) with increasing overlap among
the side lobes (large separations between the stations). At any given (large) separation, low velocities
bring about a considerable overlap between the side lobes. Higher velocities reduce the degree of
overlap at the expense of high spatial frequency information, which is filtered out by the temporal
cutoff(s) of the visual pathway (between 20 and 50 Hz, see for instance Kelly, 1979). Thus one
expects to find for each separation Δx an optimal velocity at which the side lobes just avoid overlap.
Assuming a spread of 15 Hz the optimal velocity (in degree/sec) should be v = 30·Δx (Δx in
degrees), which is in rough agreement with the data of fig. 5b. When the velocity approaches zero the
line supports in fig. 10c all tend to lie on the f_x axis (notice that, because of the finite presentation time
T, the supports effectively overlap). In this situation information about the offset cannot be retrieved.
In the limit of very high velocity the set of lobes approaches the line spectrum of a stationary grating
with no offset. Notice that we assume for the scheme of fig. 10c that the vernier threshold is higher
when some of the channels signal zero offset while the others still "see" the correct offset.
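
The estimate v = 30·Δx can be checked with a short calculation. The sketch assumes that lobes spaced 1/Δx apart along f_x on lines of slope v are spaced v/Δx apart along f_t, and that "just avoiding overlap" means this spacing equals twice the 15 Hz spread; this reading of the spread is our assumption, chosen because it reproduces the factor 30 of the text. The function names are illustrative.

```python
def lobe_spacing_hz(velocity_deg_per_s: float, separation_deg: float) -> float:
    """Distance between neighbouring side lobes along the temporal frequency axis: v / dx."""
    return velocity_deg_per_s / separation_deg


def optimal_velocity(separation_deg: float, spread_hz: float = 15.0) -> float:
    """Velocity (deg/sec) at which lobes of half-width spread_hz just avoid overlap."""
    return 2.0 * spread_hz * separation_deg


separation = 2.5 / 60.0          # a station separation of 2.5 arcmin, in degrees
v_opt = optimal_velocity(separation)
strobe_interval_ms = 1000.0 * separation / v_opt

print(f"optimal velocity: {v_opt:.2f} deg/sec")
print(f"implied strobe interval: {strobe_interval_ms:.1f} msec")
```

Under these assumptions the implied strobe interval is 1/30 sec at every separation, which falls inside the 20 to 50 msec range reported for optimal temporal interpolation.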

b) When the temporal component of the filters fails to interpolate between temporal frames motion
is perceived as discontinuous. As a consequence the spatial interpolation process correctly signals zero
spatial offset for each frame. The critical strobe interval which yields optimal temporal interpolation
is not very different between the channels (see Fig. 5a). Though the performance of an ideal scheme
(such as that of Fig. 10b) may worsen at high velocities, as for the continuous motion, it should be
rather invariant with respect to Δx, the separation between the stations. Fig. 5a shows that this does
not happen. The opposite conclusion holds for the scheme of Fig. 10a. Its performance should
deteriorate rapidly for separations Δx between the stations larger than the distance between
photoreceptors, which is in conflict with Burr's and our data. An interpolation scheme of the type of
Fig. 10c seems consistent with these results; while small, slow "receptive fields" would be unable to
interpolate correctly at large separations (Δx large), fast receptive fields could perform a correct
interpolation, if the velocity is appropriate.

The fact that spatial acuity is extremely good at separations up to 2.5' suggests that the interpolation
channels are direction selective.

4.3 Effect of Blur

a) The interpolation scheme outlined in fig. 10c makes a rather strong prediction about the effect
of blur. In the Westheimer case blur can only degrade vernier acuity, since it eliminates the high
frequency channels. Blur of the Burr stimulus, however, should improve acuity at least at large
separations and high velocities, since it eliminates side lobes which signal the absence of an offset. Our data
are fully consistent with this expectation. A more perceptual but equivalent description of the effect
of blur is this. At high velocities and large separations there is a strong sensation of a grating of thin,
unbroken lines - corresponding to the side lobes seen by visual mechanisms tuned to low temporal
and high spatial frequencies - and a weak impression of a single moving target with a clear offset
- corresponding to the main lobe seen by mechanisms tuned to lower spatial and higher temporal
frequencies. This ambiguity is removed, as already noticed by Burr (1979), by the blur of the screen,
which suppresses the high frequency grating.

b) In other terms, blur eliminates the contribution of the small receptive fields which are unable
to interpolate correctly at large separations and therefore signal zero offset. The large receptive fields,
however, remain largely unaffected by blur.

c) The effectiveness of blur in improving vernier acuity at large Δx shows that our visual system
does not normally have the intrinsic possibility of switching off the wrong channels as assumed in the
scheme of Fig. 10b.
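
The argument in a) to c) can be illustrated with a minimal sketch. It assumes, purely for illustration, that a channel is an ideal box-shaped lowpass filter in spatial and temporal frequency, and that a target moving at speed v and shown at stations dx apart has lobes along the lines f_t = -v f_x + k v/dx (k = 0 is the main lobe). The function name, the box passband and all numbers are hypothetical and are not taken from the experiments.

```python
def lobe_is_seen(k, dx, v, fx_cut, ft_cut):
    """Return True if lobe k of a stroboscopic target reaches a channel's passband.

    Assumed model (illustrative only): lobe k lies on f_t = -v*f_x + k*v/dx and
    the channel passes |f_x| <= fx_cut and |f_t| <= ft_cut.
    """
    # Smallest |f_t| that lobe k attains while f_x stays inside the spatial passband.
    min_ft = abs(v) * max(0.0, abs(k) / dx - fx_cut)
    return min_ft <= ft_cut


# Hypothetical units: dx in deg, v in deg/s, cut-offs in cycles/deg and Hz.
sharp_slow = lobe_is_seen(k=1, dx=0.5, v=10.0, fx_cut=3.0, ft_cut=1.0)
blurred_slow = lobe_is_seen(k=1, dx=0.5, v=10.0, fx_cut=1.0, ft_cut=1.0)
main_lobe = lobe_is_seen(k=0, dx=0.5, v=10.0, fx_cut=1.0, ft_cut=1.0)
print(sharp_slow, blurred_slow, main_lobe)
```

In this toy model a slow channel with a high spatial cut-off sees the first side lobe as a stationary grating of period dx, while lowering the spatial cut-off (blur) removes the side lobe and leaves the main lobe untouched.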

4.4 Spatial vs. Temporal Compensation

a) This stimulus situation corresponds to looking at the continuous motion of a vernier through the
spatial window shown in the inset of fig. 9d. The resulting Fourier support is again as in fig. 9c:
here, however, the main lobe signals no offset, corresponding to precise spatiotemporal compensation,
whereas the other lobes all signal the spatial offset between the upper and lower grating of
the window. In other words, exact compensation between space and time is realized only in the
main, correct lobe. Thus, the spatial offset should dominate as soon as the side lobes are "seen"
by some of the channels of fig. 10c. This is increasingly so for larger separations Δx between the
stations. Correspondingly, the perception of the stationary grating carrying spatial offset information
(the broken slits in the window of fig. 9d) is expected to dominate at large separations and velocities.
Again our data are consistent with these expectations. Even at relatively small separations between
the stations (see fig. 7) the system does not achieve a perfect interpolation - that is, removal of all
side lobes. Only in this case would the temporal offset exactly cancel the spatial offset. As expected,
blur improves compensation, since it helps to remove the "wrong" side lobes, which carry information
only about the spatial offset.

b) This experiment combines Burr and Westheimer stimuli. Since spatial interpolation always
retrieves the spatial offset, this dominates for all cases in which the temporal component of
interpolation is not fully correct.

5.  Discussion 

To summarize, the psychophysical experiments reported here suggest that spatiotemporal interpolation
in the visual system, remarkable though it is, is far from being perfect and flawless. Ideal
interpolation is equivalent to filtering out the side lobes in the Fourier spectrum arising from the
discrete presentations. The task is easy at small separations but requires in principle complex filters for
large separations (see Crick et al., 1980). As our data suggest, our visual systems do not seem to use
a very sophisticated spatiotemporal interpolation process. The side lobes are not effectively filtered
out under all conditions. Spatiotemporal interpolation, then, can be considered as a direct consequence
of the spatial and temporal properties of early vision, in terms of an interpolation scheme of
the type of fig. 10c. The existence of independent channels tuned to different spatial and temporal
frequencies seems to account for the spatiotemporal interpolation revealed by our experiments. A
detailed theoretical analysis with the help of appropriate computer experiments is necessary for a
quantitative evaluation of interpolation models of this type.

5.1 Explicit or implicit interpolation?

Interpolation can be regarded as a spatiotemporal filtering of the input transmitted from the retina.
This is the point of view taken in this paper. We cannot advance any hypothesis as to where this
filtering stage may be localized in the brain on the basis of our psychophysical data alone. Throughout
this paper we have used the term "interpolation" without necessarily implying a direct reconstruction
of the pattern of visual activity, say its zero-crossing profile in the various channels, somewhere in the
visual pathway. Clearly, hyperacuity may simply rely on a specialized routine operating on a small
region of the image to answer specific questions, like the right-left choice in a vernier task. Thus
the interpolation scheme suggested by our data may be implemented as an "implicit interpolation",
that is, as a computational process involving manipulation of symbolic quantities; or it may depend
on an "explicit reconstruction" of a (coded) version of the array of photoreceptor activity on a fine
retinotopic grid of neurons. These extreme possibilities - and all in between - can be implemented in a
variety of ways. For instance, activity may be reconstructed automatically on the fine topographic grid
of layer IVcβ by an automatic, parallel process.

On  the  other  hand,  a  specific,  more  symbolic  process  could  read  the  output  of  retinal  ganglion  cells 
and  perform  the  correct  interpolation  for  any  desired  position  and  time.  In  this  case  interpolation 
would  be  implicit  and  mixed  with  the  decision  process  itself. 

In the first case, the decision routine (is the upper segment to the right or to the left?) would
operate on an interpolated version of the image. Thus, "reprogramming" of the vernier routine may
not be expected to affect the interpolation process but only the detection criteria, contrary to the
second case, in which different detection strategies may influence interpolation.

5.2 Are the Psychophysical Channels the Interpolation Filters?

Our data support interpolation schemes of the type outlined in Fig. 10c. They say, however, neither
how many independent channels are needed, nor what exactly their spatiotemporal properties are.
Our results seem consistent with standard characterizations of their spatial and temporal properties
(Campbell and Robson, 1968; Burr, 1979b; see also Marr et al., 1980; Wilson and Giese, 1977; Wilson
and Bergen, 1979).

These  observations  suggest  the  interesting  idea  that  the  spatial  frequency  tuned  channels  present 
in  early  human  vision  may  be  the  interpolation  filters  themselves.  To  be  completely  explicit  let  us 
consider  simple  examples  of  how  an  interpolation  scheme  such  as  Fig.  10c  might  be  implemented 
in  the  visual  system.  The  first  possibility  is  that  the  image  is  filtered  before  interpolation  through 
various  independent  channels.  Retinal  or  LGN  ganglion  cells  of  different  sizes  could  represent  the 
image  filtered  at  different  resolutions.  Later  in  the  visual  pathway  each  of  these  representations  would 
be  independently  interpolated  on  a  finer  cortical  grid  of  cells  with  a  receptive  field  very  similar  to 
the  corresponding  LGN  cells.  Another  possibility  is  that  only  two  of  the  channels  are  present  at  the 
precortical  level  (e.g.  X  and  Y)  and  that  the  measured  psychophysical  channels  represent  interpolation 
filters  operating  on  their  X  and  Y  input  at  the  cortical  level.  In  this  second  case  one  would  expect  only 
two sizes of receptive fields - at each eccentricity - in the retina and LGN but a scatter of sizes in the
cortex (possibly in IVc). Thus the same retinal channel may be interpolated in two different ways, by
small  cortical  receptive  fields  and  by  large  ones,  the  first  reconstructing  the  high  frequency  content 
of  the  retinal  channel  and  the  second  emphasizing  its  coarser  details.  Notice  that  as  a  consequence 
cortical  (interpolation)  channels  may  have  a  narrower  bandwidth  than  retinal  ones. 

5.3  A  prediction:  interpolation  must  be  direction  selective 

An explicit interpolation scheme of this type consists of a set of motion channels with direction selective
properties, in the sense that the spatiotemporal interpolation filter thereby implemented must
depend (in one dimension) on the sign of v (see appendix of Fahle and Poggio, 1981). As a consequence
the interpolation channels should have some type of direction selective property; furthermore,
cells of layer IVc - if they are involved at all - should show, despite their center-surround receptive
field, some non-standard direction selective property.


6. Interpolation in the perifoveal visual field: does aliasing occur?

In the perifoveal retina, the spacing of the ganglion cells increases, as Barlow pointed out, whereas
the optical cut-off remains approximately the same (for instance at IIP eccentricity: see Weale, 1976).
The grid of ganglion cells is, however, matched to the spatial cut-off of the signal thereby represented:
in the cat, Peichl and Wässle (1979) have shown that receptive field diameter and ganglion cell separation
both increase towards the periphery so that sampling in the array of ganglion cells takes place at
the interval appropriate to the cut-off frequency passed by the larger receptive fields. Thus, the grid of
ganglion cells is likely to satisfy the sampling theorem (see Hughes, 1981).

A more serious, and so far unsolved, problem is whether in the perifoveal visual field the signal
represented by the ganglion cells suffers from aliasing, i.e., undersampling, at the level of the
photoreceptors. If only cones are involved, aliasing seems unavoidable for eccentricities larger than
about y - IIP. The classical sampling theorem requires that the signal is lowpass filtered before
sampling in order to avoid overlap of the sidelobes in the Fourier spectrum (i.e., aliasing). Lowpass
filtering after sampling cannot always avoid aliasing.

It is easy to show that ideal lowpass filtering after sampling eliminates overlap of the sidelobes
only up to sampling intervals that are twice the limit set by the sampling theorem.^ Preliminary computer
experiments support these conclusions for the approximately lowpass filtering performed by a
center-surround receptive field; in this case, however, effectiveness of lowpass filtering decreases more
gradually with increasing sampling intervals.
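
The factor of two can be checked with a short sketch. It assumes an ideal signal of bandwidth B sampled at interval d without prefiltering, so that the spectrum is replicated at multiples of 1/d and the first replica reaches down to 1/d - B. The function name and the numbers are illustrative assumptions.

```python
def alias_free_cutoff(bandwidth, sampling_interval):
    """Highest frequency left uncontaminated when sampling without prefiltering.

    bandwidth: highest frequency present in the signal (cycles per unit length).
    Returns 0.0 when the replicated spectra overlap over the whole band.
    """
    fs = 1.0 / sampling_interval
    # The first replica occupies [fs - bandwidth, fs + bandwidth].
    return max(0.0, min(bandwidth, fs - bandwidth))


B = 1.0
nyquist_interval = 1.0 / (2.0 * B)                               # limit set by the sampling theorem
at_limit = alias_free_cutoff(B, nyquist_interval)                # whole band is clean
between = alias_free_cutoff(B, 1.5 * nyquist_interval)           # only low frequencies are clean
at_twice_limit = alias_free_cutoff(B, 2.0 * nyquist_interval)    # nothing is clean
```

An ideal lowpass filter applied after sampling, with its cut-off at the returned value, keeps only the uncontaminated band; that band shrinks to nothing when the interval reaches twice the classical limit.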

This scheme is somewhat supported by Poliak's data showing that visual acuity thresholds increase
with eccentricity more than the separation between cones. Convergence of cones on X ganglion cells
is therefore likely to increase with eccentricity.

If aliasing cannot be fully avoided, hyperacuity threshold must rise faster with eccentricity than
visual resolution thresholds, a result which has been recently established by Westheimer (1982). If the
reason for this were indeed aliasing, blur of the vernier pattern should improve vernier acuity in the
periphery, at least in the absence of noise. Blur of the pattern corresponds to lowpass filtering of the
signal before sampling, as required by the sampling theorem. Preliminary experiments performed to
test this prediction indicate, however, that blur may improve hyperacuity only slightly, if at all (Fahle
and Poggio, 1981; Westheimer, pers. comm.; Fahle, pers. comm.).

A possible explanation for this small effect arises if input from rods (in addition to cones) is also
allowed. Aliasing in the periphery could then be largely avoided at all eccentricities by lowpass
filtering the image before sampling, by pooling together inputs from all neighboring photoreceptors -
rods and cones - via either gap junctions or synaptic coupling in second order neurons. If this prediction
were correct, the decrease of vernier acuity with eccentricity would not depend on aliasing but
would simply be a graded phenomenon due to the increasing spacing (in terms of visual angle) of
the cortical grid and to a decreasing signal to noise ratio (because of the decreasing density of cells).
The ineffectiveness of blur is consistent with this scheme. A critical test of this hypothesis may be
obtained by measuring vernier acuity in the periphery under different conditions of light adaptation.
An important corollary of this prediction is that the space constant of the electrical coupling should
increase proportionally to cone spacing from the fovea to the periphery (the rod network may have
interesting spatiotemporal properties (see Detwiler et al., 1978), possibly useful for moving patterns).
Several morphological studies have demonstrated apparent connections between cones as well as
between rods and cones in the vertebrate retina (see for instance Raviola and Gilula, 1975). Nelson
(1977) has provided physiological evidence for the cat that cones have inputs from rods, probably
mediated by the rod-cone gap junctions. The above conjecture would explain why coupling of this
type is needed already at the level of the photoreceptors, whereas improvement of signal-to-noise
ratio could be achieved in a simpler way with convergence of signals at a later level in the retina.

^ This is achieved at the expense of a much more extensive loss of high spatial frequencies than in the case of lowpass
filtering before sampling. Localization of an isolated feature like a zero-crossing is, however, rather unaffected by loss
of high spatial frequencies, in the ideal case of small noise level.
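
A minimal numerical sketch of the pooling idea follows. It is not a model of the retina: the fine grid with unit spacing stands for all photoreceptors, every fourth site stands for a cone, and averaging over neighbors is a crude stand-in for electrical coupling. All names and numbers are assumptions made for illustration.

```python
import math


def cone_samples(freq, n_photoreceptors, cone_step, pool):
    """Sample cos(2*pi*freq*x) on a unit-spaced grid at every cone_step-th site.

    If pool is True each sample is the mean over cone_step neighboring sites,
    i.e. the image is (roughly) lowpass filtered before it is sampled.
    """
    fine = [math.cos(2.0 * math.pi * freq * x) for x in range(n_photoreceptors)]
    samples = []
    for start in range(0, n_photoreceptors - cone_step + 1, cone_step):
        window = fine[start:start + cone_step] if pool else fine[start:start + 1]
        samples.append(sum(window) / len(window))
    return samples


def peak(values):
    """Largest absolute value in a list of samples."""
    return max(abs(v) for v in values)


# Cones at every 4th site: their Nyquist limit is 0.125 cycles per site,
# so a grating at 0.2 cycles per site is aliased (to 0.05 cycles per site).
freq = 0.2
unpooled = peak(cone_samples(freq, 400, 4, pool=False))
pooled = peak(cone_samples(freq, 400, 4, pool=True))
print(unpooled, pooled)
```

Without pooling the aliased grating reaches the sparse samples at full contrast; with pooling over the four neighbors its amplitude is reduced to about a quarter in this example.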

6.1  Significance  for  information  processing  and  machine  vision 

There are various methods for reconstructing the original signal at high resolution by interpolating
values measured at widely spaced intervals. The best known approach to this problem is based on
the Shannon sampling theorem and on its various extensions. For static images interpolation of this
type can provide a resolution much higher than the original sampling grid. Since in our framework
the position of zero-crossings (and not the grey level values) is important, Hildreth and Poggio have
examined the problem of interpolating the values of the ∇²G convolution in order to obtain precisely
the location of zero-crossings. Analytical arguments, supported by computer experiments, have shown
that the position of a zero-crossing can be interpolated precisely in terms of very simple interpolation
functions, even by linear interpolation. For time-varying images the situation is more complicated. In
the classical sampling theorem, interpolations in space and time are performed independently, since
the temporal dependence of the input is not constrained in any way. Interpolation algorithms based
on the constant velocity assumption discussed earlier could achieve higher spatio-temporal resolution
for objects in motion, as long as the constant velocity assumption is not grossly incorrect, despite
low spatial and temporal sampling rates. Positional acuity for the image features, e.g., the
zero-crossings, although desirable, is not the only goal of this spatiotemporal interpolation stage. A filter
that correctly interpolates the sampled image automatically avoids any defect in the representation of
the image since it reconstructs the "original" input. It avoids in particular motion smear; and it "fills
in" eventual gaps either in space or time, where or when the sampled input is missing. Real time
vision machines may well need such an interpolation stage and it will be interesting to see the form
and the performance of a computer implementation. In particular, the "gap junction" scheme for
avoiding aliasing with sparse sampling intervals may be usefully implemented in future CCD devices.
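
The remark on linear interpolation of zero-crossings can be sketched as follows. The test profile (a tanh edge) is only a hypothetical stand-in for a filtered image; it is not the ∇²G output studied by Hildreth and Poggio, and the function name is an assumption.

```python
import math


def zero_crossings(samples, spacing=1.0):
    """Locate zero-crossings of a sampled signal by linear interpolation."""
    positions = []
    for i in range(len(samples) - 1):
        a, b = samples[i], samples[i + 1]
        if a == 0.0:
            positions.append(i * spacing)
        elif a * b < 0.0:
            # The straight line through (i, a) and (i + 1, b) crosses zero
            # at the fraction a / (a - b) of the sampling interval.
            positions.append((i + a / (a - b)) * spacing)
    return positions


# Smooth profile crossing zero at x0 = 3.3, sampled at unit spacing.
x0 = 3.3
samples = [math.tanh(0.5 * (x - x0)) for x in range(8)]
estimate = zero_crossings(samples)
print(estimate)
```

For this smooth profile the estimate falls within about a hundredth of the sampling interval of the true position, which is the sense in which a very simple interpolation function can support positional accuracy well below the sample spacing.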
Appendix 1a

Logan's results apply to functions that are the restrictions to the real line of entire functions
of exponential type whose growth (on $\mathbb{R}$) is less than exponential. In particular, they apply to
periodic functions with the exception of theorem 4 (Logan, 1977), which can be specialized to periodic
functions (Logan, personal communication). If we restrict ourselves to trigonometric polynomials, it is
possible to illustrate Logan's results in a simple way. It should be stressed, however, that trigonometric
polynomials are a very special case and in general erroneous inferences can be made from their
special properties. With this "caveat" in mind, let us consider the real band limited function

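$$h(t) = \sum_{n=-N}^{N} C_n e^{int}, \qquad C_{-n} = \overline{C_n} \tag{1}$$

which can be extended to the complex plane as

$$h(z) = \sum_{n=-N}^{N} C_n e^{inz}.$$

$h(z)$ is for instance bandpass with one octave bandwidth if

$$C_n = 0 \quad \text{for } |n| < \frac{N}{2}.$$

The complex free zeros of $h(z)$ are the complex zeros of $h(z)$ in common with its Hilbert transform
$\hat h(z)$, where

$$\hat h(z) = \sum_{n=-N}^{N} \operatorname{sign}(n)\, C_n e^{inz}. \tag{2}$$
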

Let us define, given $h(z)$,

$$P(z) = \sum_{n=A}^{N} C_n e^{inz}, \qquad N(z) = \sum_{n=-N}^{-A} C_n e^{inz}, \tag{3}$$

where $A$ is the low-frequency boundary of the spectrum of $h(z)$ (assumed in the following bandpass).

Then the free zeros of $h(z)$ are completely characterized by the following three equivalent
formulations:

The free zeros of $h(z)$ are the points $z^*$ such that

$$P(z^*) = 0, \qquad N(z^*) = 0 \tag{a}$$

$$h(z^*) = 0, \qquad P(z^*) = 0 \tag{b}$$

$$P(z^*) = 0, \qquad P(\bar z^*) = 0 \tag{c}$$

Observe that if $z$ is a zero, $\bar z$ is also a zero of $h(z)$; and if $z$ is a zero, $z + 2k\pi$, $k$ an integer, is also a
zero.
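
The equivalence of the three formulations can be made explicit in a few lines. The sketch below assumes that $\hat h$ is taken with the weighting $\operatorname{sign}(n)$ on each harmonic, as in (2), and that $h$ is real on the real axis, so that $C_{-n} = \overline{C_n}$; it is meant as a reading aid, not as a reproduction of Logan's argument.

```latex
\begin{align*}
h(z) &= P(z) + N(z), & \hat h(z) &= P(z) - N(z),\\
P(z) &= \tfrac{1}{2}\,[\,h(z) + \hat h(z)\,], & N(z) &= \tfrac{1}{2}\,[\,h(z) - \hat h(z)\,],\\
N(z) &= \overline{P(\bar z)} && \text{since } C_{-n} = \overline{C_n}.
\end{align*}
```

The first two lines show that $h$ and $\hat h$ vanish together exactly when $P$ and $N$ vanish together, which gives (a), and that $h(z^*) = 0$ with $P(z^*) = 0$ forces $N(z^*) = 0$, which gives (b). The last line turns $N(z^*) = 0$ into $P(\bar z^*) = 0$, which gives (c).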

The coefficients $C_n$ of $h(z)$ may be determined by the $2N$ roots $z_j$ of $h(z)$ as the solutions of the
system of $2N$ equations

$$\sum_{n=-N}^{N} C_n e^{i n z_j} = 0, \qquad j = 1, \dots, 2N. \tag{4}$$


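Let us now rewrite

$$h(z) = \sum_{n=-N}^{N} C_n e^{inz}$$

as

$$h(z) = \zeta^{-N} \sum_{n=0}^{2N} \beta_n \zeta^{n} \tag{5}$$

with

$$\zeta = e^{iz}, \qquad \beta_n = C_{n-N}, \qquad \operatorname{Re}[z] \in [0, \pi], \qquad N = 2M.$$

Thus the nontrivial zeros of $h(z)$ coincide with the zeros of $\sum_{n=0}^{2N} \beta_n \zeta^{n}$, that is, a polynomial of
order $2N$. If the $2N$ roots $\zeta_j$ were known, it would be possible to write $2N$ equations in the
$2N + 1$ real unknowns ($C_n$):

$$\sum_{n=0}^{2N} \beta_n \zeta_1^{\,n} = 0, \quad \dots, \quad \sum_{n=0}^{2N} \beta_n \zeta_{2N}^{\,n} = 0 \tag{6}$$

with

$$\zeta_j = e^{i z_j}.$$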

Since the determinant of the roots is a Vandermonde determinant, it always has maximum rank if
the roots are distinct. The question is under which conditions the real roots alone determine, apart
from a multiplicative constant, the set of $C_n$, i.e. $h(z)$. Clearly, multiple zeros, in particular multiple
real zeros, cannot be allowed. Observe that if more than $2N$ real zero-crossings were available (in
a basic period) then $h = 0$.
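
That a full set of distinct roots fixes the coefficients up to a multiplicative constant can be seen in a small sketch, which expands the product of the factors $(\zeta - \zeta_j)$ instead of solving the linear system. The function name and the example $h(t) = 2\cos t - 1$ are assumptions chosen for illustration.

```python
import cmath
import math


def coefficients_from_roots(roots_z):
    """Expand prod_j (zeta - exp(i*z_j)); return [beta_0, ..., beta_2N], beta_2N = 1."""
    beta = [1.0 + 0.0j]  # the constant polynomial 1, lowest order first
    for z in roots_z:
        zeta_j = cmath.exp(1j * z)
        shifted = [0.0j] + beta                          # zeta * p(zeta)
        scaled = [-zeta_j * b for b in beta] + [0.0j]    # -zeta_j * p(zeta)
        beta = [s + t for s, t in zip(shifted, scaled)]
    return beta


# h(t) = 2 cos t - 1 = e^{it} - 1 + e^{-it} has N = 1 and zeros at t = +pi/3, -pi/3.
beta = coefficients_from_roots([math.pi / 3.0, -math.pi / 3.0])
print(beta)
```

With $\beta_n = C_{n-N}$ the result corresponds to $C_{-1} = 1$, $C_0 = -1$, $C_1 = 1$, that is, the original $h$ up to the overall constant. The question addressed in the text is when the real roots alone are enough.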

Under the bandpass condition ($C_n = 0$ for $n < A$) there are at least $2A$ real zero-crossings per
period. The real unknowns are $2b$, $b = N - A$, that is the number of non-zero $C_n$ between $N$ and
$A$, counted twice because they are complex numbers. A sufficient condition to ensure that there are
enough zero-crossings, and thus equations, is $A = M = N/2$, i.e., $C_n$ (for $n > 0$) all non-zero in
$(M, 2M]$. Notice that $[M, 2M]$, i.e., one octave bandwidth, would not be sufficient: in this case there would
be at least $2M$ real roots but $2(M + 1)$ unknowns $C_n$. The matrix associated to the homogeneous
equation in the "roots"

$$\begin{pmatrix}
e^{-i2Mt_1} & \cdots & e^{-i(A+1)t_1} & e^{i(A+1)t_1} & \cdots & e^{i2Mt_1}\\
\vdots & & \vdots & \vdots & & \vdots\\
e^{-i2Mt_{2M}} & \cdots & e^{-i(A+1)t_{2M}} & e^{i(A+1)t_{2M}} & \cdots & e^{i2Mt_{2M}}
\end{pmatrix}$$

has rank at most $2M - 1$ (since there exists a set of $C_n$ such that $\sum C_n e^{inz}$ vanishes identically for
$z = t_1, \dots, t_{2M}$) and this would just not suffice to specify the $C_n$ up to a multiplicative constant.

Although the less-than-1 octave condition is sufficient to ensure enough zero crossings, it is by no
means necessary. In fact there are classes of bandpass signals with a larger bandwidth and still enough
zero-crossings.
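
The lower bound on the number of zero-crossings can be checked numerically on an example. The sketch below counts sign changes of a real trigonometric polynomial whose lowest harmonic is 3 and whose highest harmonic is 4; the function, the grid size and the example are illustrative assumptions.

```python
import math


def count_sign_changes(h, n_grid=6000):
    """Count sign changes of a 2*pi-periodic function over one period (cyclically)."""
    values = [h(2.0 * math.pi * i / n_grid) for i in range(n_grid)]
    signs = [v > 0.0 for v in values if v != 0.0]
    # signs[i - 1] wraps around for i = 0, so the period is treated as a circle.
    return sum(1 for i in range(len(signs)) if signs[i] != signs[i - 1])


def h(t):
    """Example with lowest harmonic 3 and highest harmonic 4."""
    return math.cos(3.0 * t) + 0.5 * math.cos(4.0 * t)


crossings = count_sign_changes(h)
print(crossings)
```

For this example the count is at least 6, twice the lowest harmonic, and cannot exceed 8, twice the highest harmonic.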

In any case, even when there is a sufficient number of zero-crossings, the question still remains
of whether the determinant of the matrix of the "roots" has maximum rank ($2M - 1$) and
therefore the $C_n$ can be determined (up to a multiplicative constant). If the rank is less than
$2M - 1$ then the $C_n$ are not uniquely determined and as a consequence $h(z)$ is not determined by its
real roots. Logan (1977 and personal communication) has proved that

a) if a free zero exists then $h(z)$ is not uniquely determined by its real roots and

b) if there are no free zeros, $h(z)$, provided its bandwidth is appropriate, is determined, up to a
multiplicative constant, by its real zero-crossings.

In the following, we will outline Logan's main theorems for the case of trigonometric polynomials.

Theorem 1

If $h(z)$ has 1 or more free zeros, the rank $r$ of the determinant of the roots is $r < 2M - 1$.

Proof

$h(t)$ can be written as

$$h(t) = P(t) + N(t) = \prod_{k=0}^{M-1}\left(e^{it} - e^{iz_k}\right) + \prod_{k=0}^{M-1}\left(e^{-it} - e^{-i\bar z_k}\right).$$

If $\tau$ is a free zero of $h(t)$ then we can divide $h(t)$ by the real function

$$f(t) = \left(e^{it} - e^{i\tau}\right)\left(e^{-it} - e^{-i\bar\tau}\right)
= \left(2i\,e^{i\frac{t+\tau}{2}} \sin\frac{t-\tau}{2}\right)\left(-2i\,e^{-i\frac{t+\bar\tau}{2}} \sin\frac{t-\bar\tau}{2}\right)
= A \sin\frac{t-\tau}{2}\,\sin\frac{t-\bar\tau}{2} \tag{9}$$

with $A$ real.

The resulting $h(t)/f(t)$ is still a periodic bandpass function of the form

$$\frac{h(t)}{f(t)} = \sum_{n=-2M}^{-M} C'_n e^{int} + \sum_{n=M}^{2M} C'_n e^{int}$$

and actually of reduced bandwidth. Multiplication of $h(t)/f(t)$ by any arbitrary $[a - \cos(t - \alpha)]$, $a > 1$,
which can always be written as $C \sin\frac{t-z}{2} \sin\frac{t-\bar z}{2}$, provides a periodic bandpass function with the
same bandwidth as the original $h(t)$ but different from it despite the same real zeros. Notice that if $\tau$ is
not a free zero, $h(t)/f(t)$ will no longer be a periodic bandpass function. This means that the determinant
associated with the homogeneous equation 7 has at most rank $r = 2M - 2$.
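
The factorization used in the proof, namely that $a - \cos(t - \alpha)$ with $a > 1$ is a product of two sines with complex conjugate arguments and therefore has no real zeros, can be verified numerically. In the sketch below the constant is $C = 2$ and $z = \alpha + i\,\operatorname{arccosh}(a)$; the function names and the test values are assumptions.

```python
import cmath
import math


def conjugate_pair_zero(a, alpha):
    """Return z with a - cos(t - alpha) = 2 sin((t - z)/2) sin((t - conj(z))/2), a > 1."""
    return complex(alpha, math.acosh(a))


def product_form(t, z):
    """Evaluate 2 sin((t - z)/2) sin((t - conj(z))/2) for real t."""
    return 2.0 * cmath.sin((t - z) / 2.0) * cmath.sin((t - z.conjugate()) / 2.0)


a, alpha = 1.5, 0.7
z = conjugate_pair_zero(a, alpha)
worst = max(abs(product_form(0.1 * k, z) - (a - math.cos(0.1 * k - alpha)))
            for k in range(63))
print(worst)
```

The two expressions agree to rounding error over a full period, so multiplying by such a factor changes the function without adding or removing real zeros.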

Theorem 2

If $h(t)$ has no multiple and no free zeros the rank of the determinant of the real "roots" is
$r = 2M - 1$.

Proof

Clearly $r$ cannot be $r > 2M - 1$. If $h_1$ and $h_2$ have the same bandwidth and the same real zeros,
then

$$\dots = \sum_{0}^{2M-1} \dots$$

$$h_1 h_2 - \hat h_1 \hat h_2 = \sum_{0}^{2M-1} \dots \tag{12}$$

as it is easy to check by substitution of equation (2). If the real zeros are $2M$ in number and distinct,
the Vandermonde determinant associated to the real roots of equation 12 is different from zero; thus,
the unknowns are identically zero. The same argument implies that all $P_n$ are also identically zero.
Now $h_2(t)$ is any function with the same zeros (real and complex) as $h_1$. But $h_1$ is a bandlimited
function $h_1(t) = \sum_{n=-2M}^{2M} C_n e^{int}$ which is uniquely determined (apart from a multiplicative constant)
by its $4M$ real and complex zeros. Thus $h_1$ and $h_2$ must coincide identically and the theorem follows.
The theorem can be generalized allowing for real zeros.

Finally, a short remark about the multiple and free zero condition. It is rather intuitive that
multiple and free zeros are not generic; assume, for instance, that the polynomial $\sum \beta_n \zeta^{n}$ has a
free zero. It is enough to perturb one of the coefficients $C_n$ to annihilate the free zero. Similarly, if
the trigonometric polynomial is a sample function of a random process, the coefficients $C_n$ would be
random numbers, as well as the zeros of the associated polynomial $\prod (\zeta - \zeta_i)$. The probability that
a zero is free (i.e., with $\zeta_i = e^{iz_i}$, $\zeta_i$ is free iff $1/\bar\zeta_i$ is also a zero) is usually very low.


Appendix 1b

Logan's result can be extended to the case of a two-dimensional entire function $f(x, y)$ if it is
bandpass in $x$ with a band-width strictly less than an octave and band-limited in $y$. In this case, the
restriction of $f$ to a one-dimensional line $l_x$ in the $x, y$ plane parallel to the $x$ axis will be bandpass
with less than an octave band-width. Provided the free-zero condition is met, Logan's theorem tells
us that the zeros of $f$ along $l_x$ determine $f$ there up to a multiplicative constant. To determine $f$
everywhere up to a multiplicative constant, these parallel slices must be tied together.

The following lemma shows that Logan's theorem can be invoked for $f$ restricted to a line $l_\theta$ which
is not parallel to the $x$ axis. $l_\theta$ will intersect all slices $l_x$ parallel to the $x$ axis, so determining $f$ up to a
multiplicative constant on $l_\theta$ determines $f$ up to the same constant along each of the slices $l_x$.

Lemma

If $f(x, y)$ is ideally bandpass with band-width strictly less than an octave in $x$ and band-limited in $y$
then there is an $\epsilon > 0$ such that $f$ along all slices $l_\theta$ which make an angle $\theta < \epsilon$ with the $x$ axis will
be bandpass with band-width less than an octave.

Proof

The support of the Fourier transform of $f$ is confined in $\omega_x$ to the intervals $I_1 = (-2a + \delta, -a - \delta)$
and $I_2 = (a + \delta, 2a - \delta)$ and in $\omega_y$ to the interval $J = (-b, b)$ for some positive $\delta$, $a$, and $b$.
Observe that the support of the Fourier transform of a slice $l$ through $f$ is confined to the projection
of the support of the Fourier transform of $f$ onto the $\omega_l$ axis. The rectangles $I_1 \times J$ and $I_2 \times J$ will
project into the intervals $(-2a, -a)$ and $(a, 2a)$ on $\omega_l$ provided that $l$ makes a sufficiently small angle
with the $x$ axis.
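
The projection argument can be tried out numerically. The sketch below assumes that a slice at angle theta sees the frequency $\omega_x \cos\theta + \omega_y \sin\theta$, and uses hypothetical values for the band edges; the function names are assumptions.

```python
import math


def projected_band(lo, hi, b, theta):
    """Project the rectangle (lo, hi) x (-b, b) of the (w_x, w_y) plane onto a line at angle theta."""
    c, s = math.cos(theta), abs(math.sin(theta))
    return lo * c - b * s, hi * c + b * s


def within_one_octave(lo, hi, b, theta):
    """True if the projected band is strictly positive and narrower than one octave."""
    low, high = projected_band(lo, hi, b, theta)
    return low > 0.0 and high < 2.0 * low


# Hypothetical band: a = 1, margin 0.1, so I_2 = (1.1, 1.9); J = (-0.5, 0.5).
parallel = within_one_octave(1.1, 1.9, 0.5, 0.0)
small_angle = within_one_octave(1.1, 1.9, 0.5, 0.05)
large_angle = within_one_octave(1.1, 1.9, 0.5, 0.5)
print(parallel, small_angle, large_angle)
```

For a small enough angle the projected band stays inside one octave, which is all the lemma requires; at larger angles the vertical extent of the support widens the projection too much.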
Appendix 2


We consider a one dimensional pattern $g(x)$. Arbitrary, non rigid movement of this pattern produces
a spatiotemporal image $g(x, t)$. Rigid movement of the same pattern at constant speed gives an image
$g(x, t) = g(x - vt)$. We state here the classical sampling theorem for the first case and an appropriate
modification of it for the second case.

Theorem 1 (classical sampling theorem)

If a signal $g(x, t)$ is bandlimited in spatial and temporal frequencies it can be recovered exactly by
independent interpolation in space and time of its sampled values, provided that the sampling separations
$\Delta x$ and $\Delta t$ are such that $\Delta x < 1/(2f_x)$ and $\Delta t < 1/(2f_t)$, where $f_x$ and $f_t$ are the spatial and
temporal bandwidths.

Theorem 2 (Crick et al., 1981; Fahle & Poggio, 1981)

Assume that the spatiotemporal signal $g(x, t) = g(x - vt)$. The function $g$ can then be reconstructed
at the desired resolution from its spatial (temporal) samples. The required sampling density can be
decreased arbitrarily by knowledge of the velocity $v$. If only the sign of the velocity is available the
maximum sampling distance can be twice the classical limit for stationary patterns.

Comments 

a) The proof of these results can be easily obtained from diagrams in the $f_x - f_t$ Fourier plane (see
Fig. 9; Crick et al., 1981).

b) Theorem 1 requires the function $g(x, t)$ to be bandlimited before sampling takes place, since
overlap of the frequency lobes as an effect of sampling usually leads to an irretrievable loss of
information. This condition is not needed in theorem 2. Overlap never occurs (for infinitely long motion)
even when the pattern $g(x)$ is not bandlimited in spatial frequency. Any desired part of the original
spectrum can be recovered exactly (without aliasing) by an appropriate interpolation filter.

c) The spatiotemporal filter implementing the interpolation depends on $v$. Suppose, for instance, that
an interpolation scheme is endowed with direction selective properties (i.e. it uses information about the
sign of $v$): it can be shown that the new spatiotemporal filter is obtained by adding to the
spatiotemporal impulse response its Hilbert transform with a sign controlled by the sign of $v$ (in the case of
Fig. 8 the Hilbert transform of the spatial point spread function is an odd function).
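
The gain promised by Theorem 2 when the velocity is known can be illustrated with a minimal sketch. Every reading taken at station x and time t is a sample of the rigid pattern at the coordinate u = x - v t, so coarse stations combined with finer timing yield dense samples of the pattern. The function name, the test pattern and the numbers are assumptions made for illustration.

```python
import math


def pattern_samples(x_stations, t_samples, v, readout):
    """Collect samples of a rigidly moving pattern g(x - v*t) as sorted (u, value) pairs.

    readout(x, t) is the measured image value at station x and time t,
    and u = x - v*t is the coordinate attached to the moving pattern.
    """
    pairs = [(x - v * t, readout(x, t)) for x in x_stations for t in t_samples]
    return sorted(pairs)


def g(u):
    """Hypothetical one dimensional pattern."""
    return math.sin(3.0 * u)


v = 1.0
stations = [0.0, 1.0, 2.0]        # coarse spatial sampling, spacing 1.0
times = [0.0, 0.25, 0.5, 0.75]    # temporal sampling, spacing 0.25
pairs = pattern_samples(stations, times, v, lambda x, t: g(x - v * t))
gaps = [second[0] - first[0] for first, second in zip(pairs, pairs[1:])]
print(len(pairs), gaps[0])
```

Here three stations one unit apart give twelve samples of the pattern spaced 0.25 apart, that is, a spacing of v times the temporal interval. The sketch relies on the exact velocity being known and says nothing about the case in which only its sign is available.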